Low-Latency Communication on the IBM RISC
نویسندگان
چکیده
The IBM SP is one of the most powerful commercial MPPs, yet, in spite of its fast processors and high network bandwidth, the SP's communication latency is inferior to older machines such as the TMC CM-5 or Meiko CS-2. This paper investigates the use of Active Messages (AM) communication primitives as an alternative to the standard message passing in order to reduce communication overheads and to o er a good building block for higher layers of software. The rst part of this paper describes an implementation of Active Messages (SP AM) which is layered directly on top of the SP's network adapter (TB2). With comparable bandwidth, SP AM's low overhead yields a round-trip latency that is 40% lower than IBM MPL's. The second part of the paper demonstrates the power of AM as a communication substrate by layering Split-C as well as MPI over it. Split-C benchmarks are used to compare the SP to other MPPs and show that low message overhead and high throughput compensate for SP's high network latency. The MPI implementation is based on the freely available MPICH version and achieves performance equivalent to IBM's MPI-F on the NAS benchmarks.
منابع مشابه
Seamless - a latency-tolerant RISC-based multiprocessor architecture
A recurrent problem posed by parallel processing architectures is that of communication latency. In this work, the latency problem is presented with special emphasis given to RISC based multiprocessors. An interprocessor communication model for parallel programs based on locality is presented. This model enables the programmer to manipulate locality at the language level and to take advantage o...
متن کاملDesign of a Low-Latency Router Based on Virtual Output Queuing and Bypass Channels for Wireless Network-on-Chip
Wireless network-on-chip (WiNoC) is considered as a novel approach for designing future multi-core systems. In WiNoCs, wireless routers (WRs) utilize high-bandwidth wireless links to reduce the transmission delay between the long distance nodes. When the network traffic loads increase, a large number of packets will be sent into the wired and wireless links and can...
متن کاملLeading-Zero Anticipator (LZA) in the IBM RISC System/6000 Floating-Point Execution Unit
This paper presents a novel technique used in the multiply-add-fused (MAF) unit of the IBM RlSC System/6000* (RS/6000) processor for normalizing the floating-point results. Unlike the conventional procedures applied thus far, the so-called leading-zero anticipator (LZA) of the RS/SOOO carries out processing of the leading zeros and ones in parallel with floating-point addition. Therefore, the n...
متن کاملApplication Mapping onto Network-on-Chip using Bypass Channel
Increasing the number of cores integrated on a chip and the problems of system on chips caused to emerge networks on chips. NoCs have features such as scalability and high performance. NoCs architecture provides communication infrastructure and in this way, the blocks were produced that their communication with each other made NoC. Due to increasing number of cores, the placement of the cores i...
متن کاملPerformance and Experience with LAPI - a New High-Performance Communication Library for the IBM RS/6000 SP
LAPI is a low-level, high-performance communication interface available on the IBM RS/6000 SP system. It provides an activemessage-like interface along with remote memory copy and synchronization functionality. It is designed primarily for use by experienced programmers in developing parallel subsystems, libraries and tools, but we also expect power programmers to use it in end-user application...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1996